Results 1 - 20 of 36
1.
Nucleic Acids Res ; 41(Database issue): D530-5, 2013 Jan.
Article in English | MEDLINE | ID: mdl-23161678

ABSTRACT

The Gene Ontology (GO) Consortium (GOC, http://www.geneontology.org) is a community-based bioinformatics resource that classifies gene product function through the use of structured, controlled vocabularies. Over the past year, the GOC has implemented several processes to increase the quantity, quality and specificity of GO annotations. First, the number of manual, literature-based annotations has grown at an increasing rate. Second, as a result of a new 'phylogenetic annotation' process, manually reviewed, homology-based annotations are becoming available for a broad range of species. Third, the quality of GO annotations has been improved through a streamlined process for, and automated quality checks of, GO annotations deposited by different annotation groups. Fourth, the consistency and correctness of the ontology itself has increased by using automated reasoning tools. Finally, the GO has been expanded not only to cover new areas of biology through focused interaction with experts, but also to capture greater specificity in all areas of the ontology using tools for adding new combinatorial terms. The GOC works closely with other ontology developers to support integrated use of terminologies. The GOC supports its user community through the use of e-mail lists, social media and web-based resources.


Subjects
Genetic Databases, Genes, Molecular Sequence Annotation, Controlled Vocabulary, Internet, Phylogeny
2.
Database (Oxford) ; 2009: bap019, 2009.
Article in English | MEDLINE | ID: mdl-20157492

ABSTRACT

A major challenge for functional and comparative genomics resource development is the extraction of data from the biomedical literature. Although text mining for biological data is an active research field, few applications have been integrated into production literature curation systems such as those of the model organism databases (MODs). Not only are most available biological natural language (bioNLP) and information retrieval and extraction solutions difficult to adapt to existing MOD curation workflows, but many also have high error rates or are unable to process documents available in those formats preferred by scientific journals. In September 2008, Mouse Genome Informatics (MGI) at The Jackson Laboratory initiated a search for dictionary-based text mining tools that we could integrate into our biocuration workflow. MGI has rigorous document triage and annotation procedures designed to identify appropriate articles about mouse genetics and genome biology. We currently screen approximately 1000 journal articles a month for Gene Ontology terms, gene mapping, gene expression, phenotype data and other key biological information. Although we do not foresee that curation tasks will ever be fully automated, we are eager to implement named entity recognition (NER) tools for gene tagging that can help streamline our curation workflow and simplify gene indexing tasks within the MGI system. Gene indexing is an MGI-specific curation function that involves identifying which mouse genes are being studied in an article, then associating the appropriate gene symbols with the article reference number in the MGI database. Here, we discuss our search process, performance metrics and success criteria, and how we identified a short list of potential text mining tools for further evaluation. We provide an overview of our pilot projects with NCBO's Open Biomedical Annotator and Fraunhofer SCAI's ProMiner. In doing so, we prove the potential for the further incorporation of semi-automated processes into the curation of the biomedical literature.

3.
Cytogenet Genome Res ; 105(2-4): 240-50, 2004.
Article in English | MEDLINE | ID: mdl-15237213

ABSTRACT

The transcriptome of the 2-cell mouse embryo was analyzed to provide insight into the molecular networks at play during nuclear reprogramming and embryonic genome activation. Analysis of ESTs from a 2-cell cDNA library identified nearly 4,000 genes, over half of which have not been previously studied. Transcripts of mobile elements, especially those of LTR retrotransposons, are abundantly represented in 2-cell embryos, suggesting their possible role in introducing genomic variation, and epigenetic restructuring of the embryonic genome. Analysis of Gene Ontology of the 2-cell-stage expressed genes outlines the major biological processes that guide the oocyte-to-embryo transition. These results provide a foundation for understanding molecular control at the onset of mammalian development.


Subjects
Mammalian Embryo/physiology, Systems Biology, Animals, Cell Cycle, DNA Transposable Elements, Mammalian Embryo/cytology, Embryonic Development/genetics, Embryonic Development/physiology, Expressed Sequence Tags, Female, Developmental Gene Expression Regulation, Gene Library, Genes, Genomics, Male, Mice, Proteasome Endopeptidase Complex, Messenger RNA, Retroelements, Reverse Transcriptase Polymerase Chain Reaction
4.
Pac Symp Biocomput ; : 238-49, 2004.
Article in English | MEDLINE | ID: mdl-14992507

ABSTRACT

There has been increased work in developing automated systems that involve natural language processing (NLP) to recognize and extract genomic information from the literature. Recognition and identification of biological entities is a critical step in this process. NLP systems generally rely on nomenclatures and ontological specifications as resources for determining the names of the entities, assigning semantic categories that are consistent with the corresponding ontology, and assignment of identifiers that map to well-defined entities within a particular nomenclature. Although nomenclatures and ontologies are valuable for text processing systems, they were developed to aid researchers and are heterogeneous in structure and semantics. A uniform resource that is automatically generated from diverse resources, and that is designed for NLP purposes would be a useful tool for the field, and would further database interoperability. This paper presents work towards this goal. We have automatically created lexical resources from four model organism nomenclature systems (mouse, fly, worm, and yeast), and have studied performance of the resources within an existing NLP system, GENIES. Using nomenclatures is not straightforward because issues concerning ambiguity, synonymy, and name variations are quite challenging. In this paper we focus mainly on ambiguity. We determined that the number of ambiguous gene names within the individual nomenclatures, across the four nomenclatures, and with general English ranged from 0%-10.18%, 1.187%-20.30%, and 0%-2.49% respectively. When actually processing text, we found the rate of ambiguous occurrences (not counting ambiguities stemming from English words) to range from 2.4%-32.9% depending on the organisms considered.


Subjects
Artificial Intelligence, Computational Biology, Natural Language Processing, Terminology as Topic, Genetic Databases, Genomics/statistics & numerical data, Genetic Models
5.
Nucleic Acids Res ; 32(Database issue): D258-61, 2004 Jan 01.
Article in English | MEDLINE | ID: mdl-14681407

ABSTRACT

The Gene Ontology (GO) project (http://www.geneontology.org/) provides structured, controlled vocabularies and classifications that cover several domains of molecular and cellular biology and are freely available for community use in the annotation of genes, gene products and sequences. Many model organism databases and genome annotation groups use the GO and contribute their annotation sets to the GO resource. The GO database integrates the vocabularies and contributed annotations and provides full access to this information in several formats. Members of the GO Consortium continually work collectively, involving outside experts as needed, to expand and update the GO vocabularies. The GO Web resource also provides access to extensive documentation about the GO project and links to applications that use GO data for functional analyses.


Subjects
Genetic Databases, Genes, Terminology as Topic, Animals, Bibliographies as Topic, Electronic Mail, Genomics, Humans, Information Storage and Retrieval, Internet, Molecular Biology, Proteins/classification, Proteins/genetics, Software
6.
Nature ; 420(6915): 563-73, 2002 Dec 05.
Article in English | MEDLINE | ID: mdl-12466851

ABSTRACT

Only a small proportion of the mouse genome is transcribed into mature messenger RNA transcripts. There is an international collaborative effort to identify all full-length mRNA transcripts from the mouse, and to ensure that each is represented in a physical collection of clones. Here we report the manual annotation of 60,770 full-length mouse complementary DNA sequences. These are clustered into 33,409 'transcriptional units', contributing 90.1% of a newly established mouse transcriptome database. Of these transcriptional units, 4,258 are new protein-coding and 11,665 are new non-coding messages, indicating that non-coding RNA is a major component of the transcriptome. 41% of all transcriptional units showed evidence of alternative splicing. In protein-coding transcripts, 79% of splice variations altered the protein product. Whole-transcriptome analyses resulted in the identification of 2,431 sense-antisense pairs. The present work, completely supported by physical clones, provides the most comprehensive survey of a mammalian transcriptome so far, and is a valuable resource for functional genomics.


Subjects
Complementary DNA/genetics, Genomics, Mice/genetics, Genetic Transcription/genetics, Alternative Splicing/genetics, Amino Acid Motifs, Animals, Mammalian Chromosomes/genetics, Molecular Cloning, Genetic Databases, Expressed Sequence Tags, Genes/genetics, Genomics/methods, Humans, Membrane Proteins/genetics, Physical Chromosome Mapping, Tertiary Protein Structure, Proteome/chemistry, Proteome/genetics, Antisense RNA/genetics, Messenger RNA/analysis, Messenger RNA/genetics, Untranslated RNA/analysis, Untranslated RNA/genetics, Transcription Initiation Site
9.
Nucleic Acids Res ; 29(1): 91-4, 2001 Jan 01.
Article in English | MEDLINE | ID: mdl-11125058

ABSTRACT

The Mouse Genome Database (MGD) is the community database resource for the laboratory mouse, a key model organism for interpreting the human genome and for understanding human biology and disease (http://www.informatics.jax.org). MGD provides standard nomenclature and consensus map positions for mouse genes and genetic markers; it provides a curated set of mammalian homology records, user-defined chromosomal maps, experimental data sets and the definitive mouse 'gene to sequence' reference set for the research community. The integration and standardization of these data sets facilitates the transition between mouse DNA sequence, gene and phenotype annotations. A recent focus on allele and phenotype representations enhances the ability of MGD to organize and present data for exploring the relationship between genotype and phenotype. This link between the genome and the biology of the mouse is especially important as phenotype information grows from large mutagenesis projects and genotype information grows from large-scale sequencing projects.


Subjects
Factual Databases, Genome, Mice/genetics, Alleles, Animals, Genetic Markers, Internet, Inbred Strain Mice, Sequence Alignment
11.
Nucleic Acids Res ; 28(1): 108-11, 2000 Jan 01.
Article in English | MEDLINE | ID: mdl-10592195

ABSTRACT

The Mouse Genome Database (MGD) is a comprehensive public database of mouse genomic, genetic and phenotypic information (http://www.informatics.jax.org). This community database provides information about genes, serves as a mapping resource of the mouse genome, details mammalian orthologs, integrates experimental data, represents standardized mouse nomenclature for genes and alleles, incorporates links to other genomic resources such as sequence data, and includes a variety of additional information about the laboratory mouse. MGD scientists and annotators work cooperatively with the research community to provide an integrated, consensus view of the mouse genome while also providing experimental data including data conflicting with the consensus representation. Recent improvements focus on the representation of phenotypic information and the enhancement of gene and allele descriptions.


Subjects
Factual Databases, Genome, Animals, Genetic Markers, Internet, Mice, Terminology as Topic
12.
Lab Anim (NY) ; 29(3): 39-43, 2000 Mar.
Article in English | MEDLINE | ID: mdl-11375645

ABSTRACT

The Mouse Genome Database supports the use of mice in genome research, offering researchers information on gene characterization, genetic maps, comparative genomic data, and phenotypes.

14.
Nucleic Acids Res ; 27(1): 95-8, 1999 Jan 01.
Article in English | MEDLINE | ID: mdl-9847150

ABSTRACT

The Mouse Genome Database (MGD) focuses on the integration of mapping, homology, polymorphism and molecular data about the laboratory mouse. Detailed descriptions of genes including their chromosomal location, gene function, disease associations, mutant phenotypes, molecular polymorphisms and links to representative sequences including ESTs are integrated within MGD. The association of information from experiment to gene to genome requires careful coordination and implementation of standardized vocabularies, unique nomenclature constructions, and detailed information derived from multiple sources. This information is linked to other public databases that focus on additional information such as expression patterns, sequences, bibliographic details and large mapping panel data. Scientists participate in the curation of MGD data by generating the Chromosome Committee Reports, consulting on gene family nomenclature revisions, and providing descriptions of mouse strain characteristics and of new mutant phenotypes. MGD is accessible at http://www.informatics.jax.org


Subjects
Factual Databases, Genome, Mice/genetics, Animals, Chromosome Mapping, Genetic Markers/genetics, Information Storage and Retrieval, Internet, Mice/classification, Phenotype, Nucleic Acid Sequence Homology, Terminology as Topic, User-Computer Interface, Controlled Vocabulary
15.
Methods ; 14(2): 179-90, 1998 Feb.
Article in English | MEDLINE | ID: mdl-9571075

ABSTRACT

Bioinformatics has become an essential part of biological research. The rapid pace of technology development and the ability to carry out biological experimentation in large scale require computerized systems for data management, analysis, and display. Experimentation with the mouse, a major model organism of the Human Genome Initiative, has intensified the need for bioinformatics tools for mouse mapping and genome analysis. This article describes the Mouse Genome Database in the United States, a primary resource for mouse genomic data, as well as resources at the Mammalian Genetics Unit in the United Kingdom and the Animal Genome Database of Japan. Internet addresses are provided for major genetic and physical mapping resources, major genome data sites, and resources of molecular information.


Subjects
Chromosome Mapping/methods, Computational Biology, Animals, Computer Communication Networks, Computers, Cytogenetics, Databases as Topic, Genetic Linkage/genetics, Genome, Mice, Restriction Mapping
16.
Pharmacogenetics ; 8(1): 33-42, 1998 Feb.
Article in English | MEDLINE | ID: mdl-9511179

ABSTRACT

The polymorphic human CYP2D6 has been co-expressed with human NADPH-cytochrome P450 oxidoreductase in Escherichia coli in order to generate a functional recombinant monooxygenase system for the study of xenobiotic metabolism. The two cDNAs were co-expressed from separate, compatible plasmids with different antibiotic selection markers. The CYP2D6 could be detected in bacterial cells at levels up to 700 nmol l⁻¹ culture by Fe²⁺-CO versus Fe²⁺ difference spectroscopy, exhibiting the characteristic absorbance peak at 450 nm. Immunoblotting demonstrated the presence of both proteins in bacterial membranes, where they were expressed at levels significantly higher than those found in human liver microsomes. Membrane content was 150-200 pmol CYP2D6 (determined spectrally) and 100-230 pmol CYP-reductase (determined enzymatically) per mg protein. Critically, the two co-expressed proteins were able to couple to form a NADPH-dependent monooxygenase which metabolized the CYP2D6 substrate bufuralol (Vmax 3.30 nmol min⁻¹ mg⁻¹ protein; Km 11.1 µM) in isolated membrane fractions. This Km value was similar to the Km determined in human liver microsomes. Activity could be inhibited by the specific inhibitor quinidine. Of greater significance, however, was the finding that intact E. coli cells, even in the absence of exogenous NADPH, were able to metabolize bufuralol at rates almost as high as those measured in membranes (4.6 ± 0.4 min⁻¹ versus 5.7 ± 0.2 min⁻¹ at 50 µM substrate). Such recombinant strains will greatly facilitate the molecular characterization of allelic variants of cytochrome P450 isoenzymes.


Subjects
Cytochrome P-450 CYP2D6/genetics, Escherichia coli/genetics, NADPH-Ferrihemoprotein Reductase/genetics, Adrenergic beta-Antagonists/metabolism, Base Sequence, Cytochrome P-450 CYP2D6/metabolism, DNA Primers/genetics, Ethanolamines/metabolism, Gene Expression, Humans, In Vitro Techniques, Kinetics, Metoprolol/metabolism, NADPH-Ferrihemoprotein Reductase/metabolism, Plasmids/genetics, Genetic Polymorphism, Recombinant Proteins/genetics, Recombinant Proteins/metabolism
17.
Nucleic Acids Res ; 26(1): 130-7, 1998 Jan 01.
Article in English | MEDLINE | ID: mdl-9399817

ABSTRACT

The Mouse Genome Database (MGD) is a comprehensive community database that integrates genetic, genomic and phenotypic information about the laboratory mouse. MGD provides detailed information about genes and genetic markers, elemental data from mapping experiments, descriptions of molecular segments including ESTs, probes, and cDNA clones, homology information between mouse and many other mammalian genomes, and phenotypic descriptions of gene mutations, gene function and mouse strains. All data are supported by citations. Interactive graphical displays of cytogenetic, genetic and physical maps are available. User support is provided through dedicated staff, bulletin boards, and user documentation. MGD can be accessed at http://www.informatics.jax.org


Subjects
Factual Databases, Mice/genetics, Animals, Chromosome Mapping, Factual Databases/trends, Forecasting, Genetic Markers, Genome, Humans, Information Storage and Retrieval, Molecular Probes, Rats, Terminology as Topic
19.
Mol Mar Biol Biotechnol ; 6(1): 1-20, 1997 Mar.
Article in English | MEDLINE | ID: mdl-9116867

ABSTRACT

The phylogenetic position of the ancient family Pleurotomariidae within the Molluscan class Gastropoda, as well as the relationships of its Recent genera and species, were assessed using an iterative, two-gene (18S rDNA and cytochrome c oxidase I) approach to phylogeny reconstruction. In order to orient the Pleurotomariidae within Gastropoda, partial 18S rDNA sequences were determined for 7 pleurotomariid and 22 other gastropods that span the major groups within the class as well as for one cephalopod and two polyplacophorans, which serve as outgroups. Cladistic analyses of a sequence of approximately 450 base pairs (bp) near the 5' end of the 18S rDNA support the monophyly of the following higher gastropod taxa: Patellogastropoda, Vetigastropoda, Neritopsina, Apogastropoda, and its subclades Caenogastropoda and Heterobranchia. The 18S rDNA sequences and 579 bp of cytochrome c oxidase I (COI) analyzed separately and together, indicate that Pleurotomariidae are included within Vetigastropoda but comprise a clade that is the sister group to the other families referred to this order. Monophyly of the Pleurotomariidae is also supported by the unique presence of seven separate inserts (ranging in length from 1 to 68 bp) within the V2 variable region of the 18S RNA. Relationships of the genera and species within Pleurotomariidae are fully resolved using "total molecular evidence" consisting of partial sequences of 18S rDNA and COI and including data on length variation within the inserts.


Subjects
Ribosomal DNA/chemistry, Ribosomal DNA/genetics, Electron Transport Complex IV/chemistry, Electron Transport Complex IV/genetics, Mollusca/classification, Mollusca/genetics, Phylogeny, 18S Ribosomal RNA/genetics, Animals, Base Sequence, DNA Primers, Molecular Evolution, Molecular Sequence Data, Polymerase Chain Reaction, Nucleic Acid Sequence Homology, Software
20.
Nucleic Acids Res ; 25(1): 85-91, 1997 Jan 01.
Article in English | MEDLINE | ID: mdl-9045213

ABSTRACT

The Mouse Genome Database (MGD) is a comprehensive community resource of mouse genetic and biological information populated both with data from published literature and with data electronically submitted from the research community. MGD stores genetic, physical and comparative mapping data, clones/probes/PCR information, and phenotype descriptions for genes, mutations and mouse strains. Supporting software for importation, analysis, display and distribution of mouse genetic data have been developed. User support is provided through dedicated staff providing documentation, training, and response to individual user queries. MGD is accessible over the Internet at URL http://www.informatics.jax.org.


Subjects
Factual Databases, Genome, Mice/genetics, Animals, Maine, Private Sector